Media Lab Project List 9/93
Text file, 1993-12-15
RESEARCH PROJECTS IN THE MEDIA
LABORATORY

I. LEARNING & COMMON SENSE
1. Children and Machines
2. Memory-Based Representation
3. Understanding News
4. Iconic Stream-Based Video Logging
5. Storyteller Systems
6. FRAMER: Knowledge Description and Sharing
7. Graphics by Example
8. Graphics for Software Visualization
9. The Berlin Wall of Programming
10. Intelligent Technical Documentation
11. Graphical Annotation
12. Instructible Agents
13. Agent-Application Communication
14. Autonomous Agents
15. Interface Agents
16. Editors, Agents, and Butlers
17. Society of Mind
18. Animal Construction Kits
19. Structure out of Sound
20. Constructionism
21. Robot Design Competitions
22. Project Headlight
23. Learning in Multicultural Settings
24. Science and Whole Learning Teachers' Collaborative
25. Electronic Communication
26. Children as Designers
27. Games
28. Study of Mathematical Thinking
29. Thinking and Learning about Systems
30. Ubiquitous Computing for Kids
31. New Visions of Programming in Education
32. Learning in Virtual Communities

II. PERCEPTUAL COMPUTING
33. Mid-Level Vision
34. X-Y-T Image Analysis
35. Analysis of Egomotion Using Wide Angle Vision
36. Modeling and Tracking People
37. Dynamic Scene Annotation
38. Multimodal Natural Dialog
39. Advanced Interactive Mapping Displays
40. Information Appliances
41. Structure out of Sound
42. Looking at People
43. Model-Based Image Coding
44. Video Databases: Indexing by Content
45. Image Query by Texture Content
46. Nonlinear Space-Time Texture Models
47. Semantic Image Modeling
48. Computers and Telephony
49. Desktop Audio
50. Voice Interfaces to Hand-Held Computers
51. Voice Hypermedia
52. Telephone-Based Voice Services
53. Synthetic Performers
54. Synthetic Listeners
55. Synthetic Spaces
56. Cognitive Audio Processing
57. Structured Audio Transmission

III. INFORMATION & ENTERTAINMENT
58. Salient Stills
59. Color Semantics
60. Knowing the Individual
61. Interactive Computation of Holographic Images
62. Scaled-Up Holographic Video
63. Holographic Laser Printer
64. Immersive Projected-Image Holographic Displays
65. Medical Image Holography
66. Edge-Lit Holograms
67. Open Architecture Television
68. Cheops: Data-Flow Television Receiver
69. Motion Modeling for Video Coding
70. Production, Distribution, and Viewing of Structured Video Narratives
71. Multimedia Testbed
72. Computationally Expressive Tools
73. Large-Scale, High-Resolution Display Prototypes
74. Input/Output Considerations
75. Advanced Interactive Mapping Displays
76. Experiments in Elastic Media
77. Video Editing: Computational Partnerships
78. Stories with a Sense of Themselves
79. Directing Digital Video: New Tools
80. Storyteller Systems
81. Production, Distribution, and Viewing of Structured Video Narratives
82. Real-Time Modeling
83. Interface Sensors and Transducers
84. Information, Computation, and Physics
85. Incremental Coding
86. Movies via Modems
87. Objective Coding
88. Dimensionalization
89. Casual Collaboration
90. Structure out of Sound
91. Hyperinstruments
RESEARCH
The ongoing research of the Media
Laboratory extends across a wide realm
of activities, which may be clustered
into three broad areas: LEARNING &
COMMON SENSE, PERCEPTUAL COMPUTING,
and INFORMATION & ENTERTAINMENT.
I. LEARNING & COMMON SENSE
1. Children and Machines (Professor
Edith Ackermann)
Several projects involve children's
conceptions of machines. One project
focuses specifically on how young
children describe and understand the
functioning of simple machines.
Another project focuses on
descriptions of cybernetic machines
that interact with their environments.
A major interest is in how children
think about such machines, whether
they see them as "creatures" or as
"things."
2. Memory-Based Representation
(Professor Kenneth Haase)
We are developing an alternative
account of representation where the
structure of knowledge and cognition
emerges from the connection of current
descriptions to past situations and
not from some a priori framework into
which situations and experience are
translated. Artificial Intelligence
and Cognitive Science traditionally
assume that one's representation
(one's encoding of experience)
determines the structure of memory; we
are exploring models of memory where
this determination goes in both
directions. Descriptions are stored in
memory by connecting them with
descriptions already recorded and
noting the residual differences
unexplained by the connections made.
In this way, what is stored in memory
has a significant effect on how future
descriptions are encoded and stored.
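The storage scheme described above can be sketched in a few lines of Python. Everything here is illustrative - the data, the feature-set representation, and all function names are invented for this sketch, not taken from the Lab's system - but it shows the idea: a new description is linked to its closest stored neighbor, and only the residual differences are recorded.

```python
# Toy memory-based store: link each new description to its nearest
# stored neighbor and record only the unexplained residue.

def closest(memory, desc):
    """Return the stored entry sharing the most features with desc."""
    return max(memory, key=lambda m: len(m["features"] & desc), default=None)

def store(memory, desc):
    base = closest(memory, desc)
    entry = {"features": desc,
             "linked_to": None if base is None else base["features"],
             "residual": desc if base is None else desc - base["features"]}
    memory.append(entry)
    return entry

memory = []
store(memory, {"dog", "chased", "cat"})
e = store(memory, {"dog", "chased", "mailman"})
print(e["residual"])   # only the difference is new: {'mailman'}
```

Because later descriptions link to whatever is already stored, what memory contains shapes how the next description is encoded, which is the two-way determination the project explores.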
3. Understanding News
(Professor Kenneth Haase)
We are applying our memory-based
representation systems to
comprehending, filtering, and
summarizing news stories. News stories
taken from various wire services and
other sources are run through a simple
parser which annotates the text with
phrase boundaries and possible
relationships between phrases. This
annotated text is then passed to the
memory-based representation system and
"understood" by identification of and
connection with similar stories
already in memory; preferences and
queries are interpreted as partial
stories which match incoming or
recorded descriptions. Comparison of
such understood texts with texts
previously read by a user allows user-
specific summarization of new articles
based on the real differences between
articles. In addition to filtering
incoming daily news, these tools
provide an interface to large text
databases and other sorts of databases
(e.g., images and video segments)
annotated with textual descriptions.
One strategic advantage of this
approach is that in the worst case, it
does as well as keyword matching -
similar words indicate similar
articles - yet in the best case it
does as well as a human editor or
selector.
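The worst-case claim above - matching never does worse than keyword overlap - and the idea of user-specific summarization can both be sketched with hypothetical data (the word-set representation and names below are illustrative only):

```python
# Sketch: a query is a "partial story" of words; matching degrades
# gracefully to keyword overlap, and a user-specific summary keeps
# only the content not covered by articles the user already read.

def match_score(partial_story, article):
    return len(partial_story & article)

def novel_content(article, already_read):
    seen = set().union(*already_read) if already_read else set()
    return article - seen

read = [{"fire", "downtown", "warehouse"}]
new = {"fire", "downtown", "arson", "suspect"}
print(sorted(novel_content(new, read)))   # ['arson', 'suspect']
```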
4. Iconic Stream-Based Video Logging
(Professor Kenneth Haase)
Media Streams is an iconic logging
system for video content which
provides the descriptions used by
storyteller systems, archival
retrieval programs, content-based
editors, and other systems which can
take advantage of knowing the content
of recorded video. The logger treats
video as a stream with temporally
bounded events rather than as a set of
clips with attached keywords; this
allows the system to automatically
"cut" the video to its own purposes.
Video annotations are represented
graphically to enhance data
visualization and to enable logs to be
shared among human and machine users;
in addition, palettes of commonly used
sets of iconic annotations streamline
the logging of segments similar to
segments seen before. The indexing of
both the video itself (whose images
are stored digitally) and of the icon
palettes connects to the facilities of
a memory-based representation in the
background.
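The stream-versus-clips distinction can be made concrete with a small sketch (the class, labels, and times below are invented for illustration): because each annotation covers a time interval rather than a fixed clip, the system can "cut" the stream at any point and still know what the resulting segment contains.

```python
# Stream-based annotation sketch: annotations are temporally bounded
# events, so any cut [t0, t1] can be described by interval overlap.

from dataclasses import dataclass

@dataclass
class Annotation:
    label: str
    start: float   # seconds
    end: float

log = [Annotation("woman walking", 0.0, 12.5),
       Annotation("city street", 0.0, 30.0),
       Annotation("bus passes", 8.0, 11.0)]

def content_at(log, t0, t1):
    """Labels of annotations whose interval overlaps the cut [t0, t1]."""
    return [a.label for a in log if a.start < t1 and a.end > t0]

print(content_at(log, 9.0, 10.0))   # all three events are on screen
print(content_at(log, 20.0, 25.0))  # only the street remains
```

A keyword-on-clip scheme cannot answer the second query without re-logging, which is why the stream representation lets the system repurpose footage automatically.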
5. Storyteller Systems
(Professor Kenneth Haase and Professor
Glorianna Davenport)
Storyteller systems are sophisticated
programs with deep and detailed
knowledge of some particular domain or
domains and access to "media
resources" - recorded video, sound,
and text - regarding the domain. By
combining these resources with
synthesized graphical and textual
representations, a storyteller system
produces a story customized to what it
knows - and what it learns - of a
listener's background, preferences,
and interests. These stories emerge
dynamically as the system interacts
with the user; questions and
criticisms yield wholly new sequences
of video, sound, and explanation in
reply. Such systems transform the
character of publication: rather than
producing epistles, one produces
emissaries.
6. FRAMER: Knowledge Description and
Sharing
(Professor Kenneth Haase)
FRAMER is a portable library for
knowledge representation and inference
being used in a variety of projects
around the Lab. FRAMER provides a
persistent object-oriented database
with a simple inheritance mechanism
and an embedded extension language
(FRAXL) based on SCHEME. FRAMER data
structures are easily shared between
different hardware platforms
(workstations, Macintoshes, PCs) and
software platforms (C and LISP).
Current work on FRAMER includes the
development of a portable user
interface API for FRAXL, a networked
implementation supporting the
distribution of programs and data, and
the integration of ongoing analogical
representation work with FRAMER.
7. Graphics by Example
(Henry Lieberman)
Experts in visual domains such as
graphic design are fluent in the
generation and critique of visual
examples. We are combining
representation and learning techniques
from artificial intelligence with
interactive graphical editors to
create a "programming by example"
system to assist designers in
automating graphical procedures.
8. Graphics for Software Visualization
(Henry Lieberman)
This project explores how modern
computer graphic imagery can be used
as a tool to help programmers
visualize software. We are
implementing a range of experimental
debugging systems that use color,
animated typography, and three-
dimensional visual representation of
programs.
9. The Berlin Wall of Programming
(Henry Lieberman)
The increasing demand for graphical
workstations creates a schism between
fast languages, such as C, and
prototyping languages, such as LISP,
in the UNIX environment. We are
researching methods of overcoming this
split in order to integrate AI with
graphics in real time.
10. Intelligent Technical
Documentation
(Henry Lieberman)
Technical documentation for hardware
and software is expensive to produce,
often inaccurate and inadequate. We
are exploring a new approach to
producing technical documentation in
which an expert interacts with a
simulation of a device, and the system
automatically produces both English
descriptions and visual illustrations.
11. Graphical Annotation
(Henry Lieberman)
People often communicate important
knowledge by drawing and labeling
diagrams. Why can't we communicate
knowledge to a machine by using
graphical indications of parts and
structure rather than by textual
databases or programming languages? We
are using computer-readable graphical
annotation of images in a direct-
manipulation editor to communicate
relations that tell the system how to
interpret and generalize user actions.
We are also exploring voice input so
that the user can explain actions to
the machine as they are being
performed.
12. Instructible Agents
(Henry Lieberman)
Agent software can perform tasks
automatically on behalf of a user, but
how does the agent come to learn what
the user wants? Sometimes the agent
can learn just by observing user
behavior, but there may also need to
be interaction where the user
instructs the agent more explicitly.
The instructibility aspect is the
focus of this project. The user may
present examples of behavior that the
agent should follow and give advice to
the agent as to how the examples
should be interpreted. The agent must
give feedback to the user so that the
user understands what the agent knows
and is capable of doing. Multimodal
interaction is important in both the
instruction and feedback.
13. Agent-Application Communication
(Henry Lieberman)
Current experiments in agent software
rely mostly on domain-specific
applications that have been programmed
from scratch or explicitly modified
with agents in mind. Is it possible to
make a toolkit
or protocol that would allow an agent
to communicate and control
applications that have been
constructed more conventionally? Can
the agent "take the place" of the user
in the interface? Can the agent have
access to the application's data and
behavior? Will commercial "inter-
application communication" mechanisms
suffice? What is the division of labor
between the agent and the application?
14. Autonomous Agents
(Professor Pattie Maes)
This project applies artificial
intelligence techniques to the field
of human-computer interaction. In
particular, techniques and systems
developed in the area of autonomous
agents and the area of commonsense
representation are combined to
implement "interface agents":
interfaces that provide expert
assistance to a person engaged in the
use of a particular computer
application. Interface agents differ
from current day interfaces in that
they are more autonomous (performing
many of the time-consuming, more
mundane tasks the user normally would
have to perform), more intelligent
(learning from the user by observation
and querying), and more personalized
(customizing according to the user's
goals, needs, preferences, habits, and
history of interaction with the
system). The project focuses on how
interface agents can acquire their
competence using machine-learning
techniques.
15. Interface Agents
(Professor Pattie Maes)
This project applies artificial
intelligence techniques to the field
of human-computer interaction. In
particular, techniques and systems
developed in the area of autonomous
agents and the area of commonsense
representation are combined to
implement "interface agents":
interfaces that provide expert
assistance to a person engaged in the
use of a particular computer
application. Interface agents differ
from current day interfaces in that
they are more autonomous (performing
many of the time-consuming, more
mundane tasks the user normally would
have to perform), more intelligent
(learning from the user by observation
and querying) and more personalized
(customizing according to the user's
goals, needs, preferences, habits, and
history of interaction with the
system).
16. Editors, Agents, and Butlers
(Professor Pattie Maes)
This project attempts to deal with the
problem of news information overload.
We are building "interface agents" for
news filtering. These are semi-
intelligent computer systems that make
personalized suggestions to a user for
news items (text, video, audio). The
user is able to browse through the
news available (as is the case with
current interfaces), but some of the
news items will have been
"highlighted" while other items might
have been left out by the agents.
These agents learn which news items
the user might be interested in, in three
different ways. First, the user is
able at all times to instruct an agent
about which news items the user wants
to receive or not receive. Second, the
user is given the option of providing
feedback to the agent about how much
certain news items are liked or
disliked. These feedback data are used
by the agent to discover regularities
in the user's news interests in terms
of the content of the article, as well
as other features such as the author,
urgency, and news source. Third, these
feedback data are used to detect
similarities between different users
and to discover "clusters" of users
with similar news interests (on a
given news topic). Once such clusters
have been detected, news items that
one or more users liked are suggested
by the agent to a user with similar
interests.
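The second and third learning modes above - feedback-driven feature scoring and clustering of like-minded users - can be sketched as follows. All names, features, and data are hypothetical; the actual agents' learning methods are not specified in this description.

```python
# Sketch: per-user feature scores learned from like/dislike feedback,
# plus a crude agreement measure for finding similar users.

def update(profile, features, liked):
    """Nudge each feature's score up on a like, down on a dislike."""
    for f in features:
        profile[f] = profile.get(f, 0) + (1 if liked else -1)

def score(profile, features):
    """Predicted interest in an item with these features."""
    return sum(profile.get(f, 0) for f in features)

def similarity(p1, p2):
    """Count shared features on which the two users' feedback agrees."""
    shared = set(p1) & set(p2)
    return sum(1 for f in shared if (p1[f] > 0) == (p2[f] > 0))

alice, bob = {}, {}
update(alice, {"sports", "urgent"}, liked=True)
update(bob, {"sports"}, liked=True)
update(bob, {"politics"}, liked=False)
print(score(alice, {"sports", "weather"}))   # 1
print(similarity(alice, bob))                # 1
```

Once users cluster by agreement, an item one cluster member liked can be suggested to the others, which is the third mode described above.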
17. Society of Mind
(Professor Marvin Minsky)
Professor Minsky continues to develop
the theory of human thinking and
learning called the "Society of Mind."
This theory explores how phenomena of
mind emerge from the interaction of
many disparate agencies, each mindless
by itself. For example, one aspect of
the theory explains reasoning by
analogy on the basis of transforming
between different kinds of knowledge
representations. Another aspect is a
"re-duplication" account of natural
language, in which grammatical forms
are seen as emerging directly from
expressive requirements of
communication between different
mechanisms inside the brain, rather
than from conventions that
communications between people are
forced to fit. Professor Minsky has a
continuing interest in the limits and
potentials of "connectionist learning
systems" and their role in distributed
cognitive accounts like the Society of
Mind. He is actively considering how
such systems may be combined and
interconnected in a way that avoids
the serious scaling problems of
unstructured connectionist systems.
18. Animal Construction Kits
(Professor Marvin Minsky)
This project simulates animal
behavior, with the goals of developing
computational models for ethology and
investigating situated-action
approaches to artificial intelligence.
A related
goal is the development of
environments for facilitating such
projects.
19. Structure out of Sound
(Professor Marvin Minsky, Andrew
Lippman, and Michael Hawley)
In an information-rich environment
where data, images, and sound are
readily accessible and digitally
communicated, the issue of content-
based search becomes a necessity.
Structure out of Sound is the first
attempt at a unified analysis tool for
speech, music, and sound effects.
Movies are analyzed into sonic
primitives that allow one to divide a
movie into dialogue and action or to
identify the presence of a single
actor. The initial work, a doctoral
thesis, lays out the groundwork for
later addition of visual browsing and
correlating elements.
20. Constructionism
(Professor Seymour Papert, Professor
Edith Ackermann, and Professor Mitchel
Resnick)
We are developing "constructionism" as
a theory of learning and education.
Constructionism is based on two
different senses of "construction." It
is grounded in the idea that people
learn by actively constructing new
knowledge, not by having information
"poured" into their heads. Moreover,
constructionism asserts that people
learn with particular effectiveness
when they are engaged in
"constructing" personally meaningful
things (such as stories, animations,
or robots).
21. Robot Design Competitions
(Professor Seymour Papert and
Professor Mitchel Resnick)
We have helped develop an intensive,
one-month robot design course for MIT
undergraduates. In the course,
students design and build robots made
from electronic and LEGO parts, then
pit the robots against one another in
elimination-style competition. The
Robot Design Competition is a living
laboratory for the constructionist
theory of learning, and a vehicle for
exploring the role of design
activities in education. In the
future, we plan to organize similar
activities for precollege students,
using our new "Programmable Brick"
technology.
22. Project Headlight
(Professor Seymour Papert)
Eight years ago, we began a
partnership with the Hennigan School,
a multicultural public elementary
school in Boston. At the school, we
have helped develop a technology-rich
environment, with more than 100
personal computers for 200 students.
We have worked with teachers and
students to explore new approaches to
education and new uses of technology
in education.
23. Learning in Multicultural Settings
(Professor Seymour Papert and
Professor Edith Ackermann)
For several years, we have focused on
issues related to gender, race,
culture, and cognitive styles. One
setting for this research is Paige
Academy, a small, independent
Afrocentric school in the Roxbury
section of Boston. This setting
provides an organizationally and
culturally different context for the
development of new ideas about
learning.
24. Science and Whole Learning
Teachers' Collaborative
(Professor Seymour Papert)
We maintain a working relationship
with a network of teachers from
different schools (mostly in the
Boston area, but also some in other
parts of the country). Through this
network, we have collaborated with
teachers in developing concepts for
workshops, seminars, and other
activities to foster their
professional development.
25. Electronic Communication
(Professor Seymour Papert and
Professor Mitchel Resnick)
We maintain a telecommunications
network through which collaborating
teachers and schools can maintain
contact with the group and with one
another. Elementary-school students
also use the network. In one project,
bilingual students in Boston are
communicating with students in Costa
Rica.
26. Children as Designers
(Professor Seymour Papert and
Professor Edith Ackermann)
We are studying how children can
change from "consumers" into
"designers" of computer-based
multimedia productions. In one
project, elementary-school students
are designing their own computer games
- and, in the process, learning about
programming, mathematics,
collaboration, and design. The project
is an extension of earlier research in
which children designed instructional
software to help other students learn
about fractions.
27. Games
(Professor Seymour Papert and
Professor Mitchel Resnick)
The idea of playful learning is
pervasive in all of our activities.
Specific game-oriented research
includes studying children's attachment
to video games, studying the informal
learning process through which
children master new games, and
studying children as designers and
implementers of their own games.
28. Study of Mathematical Thinking
(Professor Seymour Papert)
The theme of studying mathematical
thinking pervades many projects. A
specific project in this category is a
study of probabilistic thinking in
children and adults.
29. Thinking and Learning about
Systems
(Professor Mitchel Resnick)
We are studying how students think
about "systems concepts" (such as
feedback, self-organization, and
evolution), and how to make these
ideas more accessible to young
children. As part of this effort, we
have developed an extended version of
Logo with thousands of interacting
graphic turtles, which students can
use to explore ideas about self-
organizing and decentralized systems
(such as ant colonies and traffic
jams).
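The flavor of a decentralized model like the traffic-jam example can be shown in plain Python standing in for the extended Logo (the road length, car count, and rule are invented for this sketch): each car follows one local rule - move ahead only if the next cell is empty - and jams emerge with no central cause.

```python
# Decentralized traffic sketch: cars on a one-lane ring road, each
# obeying a single local rule, with no global controller.

import random

random.seed(1)
ROAD = 30
cars = set(random.sample(range(ROAD), 12))   # 12 cars at random cells

def step(cars):
    moved = set()
    for c in sorted(cars):
        ahead = (c + 1) % ROAD
        # local rule: advance only if the cell ahead was empty
        moved.add(ahead if ahead not in cars else c)
    return moved

for _ in range(5):
    cars = step(cars)
print("".join("#" if i in cars else "." for i in range(ROAD)))
```

Runs of `#` in the printout are jams: no car intends to stop, yet clusters of stopped cars form wherever cars happen to bunch up, which is the kind of emergent, self-organizing behavior students explore with the turtle version.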
30. Ubiquitous Computing for Kids
(Professor Mitchel Resnick)
We are extending the notion of the
child's construction kit, adding
computational elements to the bin of
building parts, so that children can
embed computational power in the
machines they build, and spread
computation throughout their world.
This idea is part of a more general
movement toward "ubiquitous computing"
- the incorporation of computational
elements into everyday objects. As
part of this effort, we are developing
a "Programmable Brick" - a LEGO brick
(the size of a deck of cards) with a
computer inside.
31. New Visions of Programming in
Education
(Professor Mitchel Resnick)
We are introducing new "programming
paradigms" into educational computing
- for example, adding multiprocessing
capabilities to the Logo programming
language. These new paradigms not only
extend the types of projects that
children can work on (for example,
making it much easier for children to
create their own video games), they
also help children develop new ways of
thinking about certain mathematical
and scientific concepts.
32. Learning in Virtual Communities
(Professor Mitchel Resnick)
Imagine students from many different
schools, each connected (via the
Internet) to the same "virtual world."
Students can "walk" around the world,
and meet and talk with other students.
Perhaps one "room" in the world is
dedicated to discussions about
environmental issues. The world is
also extensible: students can create
and program new "objects" and new
"rooms." We are creating such on-line
worlds (known generically as "MUDs")
as a context for students to become
meaningfully engaged in reading,
writing, and programming.
II. PERCEPTUAL COMPUTING
33. Mid-Level Vision
(Professor Edward Adelson)
We are developing early and mid-level
vision mechanisms that emulate the
processing that occurs in primate
visual cortex and are designing
algorithms that apply them with high
computational efficiency. The
mechanisms are useful for edge
detection, texture analysis, motion
analysis, and image enhancement.
34. X-Y-T Image Analysis
(Professor Edward Adelson and
Professor Aaron Bobick)
We treat a sequence of images as a
three-dimensional volume, with the
dimensions of x, y, and t (time).
Motion analysis involves orientation-
selective filtering within this
volume. We are developing techniques
for dealing with difficult situations
such as motion occlusion and motion
transparency.
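The reason motion analysis reduces to orientation-selective filtering in the x-y-t volume can be seen in one dimension: a pattern translating at velocity v satisfies the constraint Ix*v + It = 0, so its space-time trace is a tilted structure whose slope is the velocity. The sketch below uses a synthetic signal, not the Lab's filters:

```python
# Why motion is orientation in x-t: recover a known velocity from
# space-time gradients of a translating 1-D pattern.

import math

def intensity(x, t, v=2.0):
    return math.sin(0.5 * (x - v * t))   # pattern moving right at v

# central differences approximate the partial derivatives Ix and It
dx = dt = 1e-4
x0, t0 = 1.0, 0.0
Ix = (intensity(x0 + dx, t0) - intensity(x0 - dx, t0)) / (2 * dx)
It = (intensity(x0, t0 + dt) - intensity(x0, t0 - dt)) / (2 * dt)

v_est = -It / Ix          # from the brightness constancy constraint
print(round(v_est, 3))    # recovers v = 2.0
```

Occlusion and transparency are hard precisely because they break this single-orientation picture: two motions superimpose two orientations in the same region of the volume.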
35. Analysis of Egomotion Using Wide
Angle Vision
(Professor Aaron Bobick)
A critical problem in computer vision
is determining the motion of the
camera through a scene (egomotion). We
are developing techniques for using
stereo, wide-angle imagery data to
give a better egomotion estimate than
monocular sequences of images, and in
a way that is much simpler than
previous approaches.
36. Modeling and Tracking People
(Professor Aaron Bobick)
The ability to track people in
imagery, and determine their positions
and pose, is critical for many machine
interface and telecommunications
technologies. The goal of this
research is to use generic models of
people along with known information
about the environment to maintain an
accurate geometric model of the
people. Doing this requires
intelligent reasoning about multiple
views and occlusion.
37. Dynamic Scene Annotation
(Professor Aaron Bobick)
In a dynamic scene, what is in the
image is less important than what is
happening in the scene. We are
developing dynamic description
mechanisms capable of extracting the
important aspects of the behavior or
motion present in a scene. Two domains
we are exploring are charting football
plays and extracting choreography from
a ballet sequence.
38. Multimodal Natural Dialog
(Dr. Richard A. Bolt)
People in each other's presence
communicate via speech, gesture, and
gaze. The aim of this research is to
make it possible for people to
communicate with computers in
essentially the same way. This
research explores combined speech,
free-hand manual gesture, and gaze as
input modes to the computer. One side
of this effort is adapting
technologies to capture inputs from
the user: a speech recognizer,
gesture-sensing gloves, and a head-
mounted eye-tracking system. These
technologies are off-the-shelf, and as
more efficient, less obtrusive
technologies emerge they will be
assimilated into the work. The other
side of the effort involves the
creation and elaboration of the
software intelligence to interpret
input from speech, hands, and eyes,
and to map to an appropriate response
in graphics and speech or nonspeech
sound.
The main expected outcome from this
research is that computer-naive people
(read: most of the world) will be
able to use everyday social and
linguistic skills to access computers
and computer-based media.
39. Advanced Interactive Mapping
Displays
(Dr. Richard A. Bolt, Professor Muriel
R. Cooper, and Ronald MacNeil)
This topic represents a three-year
project involving:
*Development of graphically intelligent
tools and principles to support the
interactive creation of symbolic
information landscapes.
*Integration of such landscapes with
pictorially convincing virtual
environments.
*Enabling of multimodal natural
language communication with the
virtual environment display and its
contents via combinations of speech,
manual gesture, and gaze.
These three streams of investigation
are to converge in year three of the
overall project in the context of an
ultra-high-definition, seamlessly
tiled wall-sized display (DataWall).
40. Information Appliances
(Michael Hawley)
Tools and appliances of all sorts,
from wristwatches and notebooks to
concert grand pianos and home
entertainment systems, are sprouting
digital components. To interoperate
harmoniously, and to ease the personal
interface to a global information
system, appliances need to communicate
with each other. This project studies
the languages and systems required for
an open and scalable architecture.
41. Structure out of Sound
(Michael Hawley, Professor Marvin
Minsky, and Andrew Lippman)
In an information-rich environment
where data, images, and sound are
readily accessible and digitally
communicated, the issue of content-
based search becomes a necessity.
Structure out of Sound is the first
attempt at a unified analysis tool for
speech, music, and sound effects.
Movies are analyzed into sonic
primitives that allow one to divide a
movie into dialogue and action or to
identify the presence of a single
actor. The initial work, a doctoral
thesis, lays out the groundwork for
later addition of visual browsing and
correlating elements.
42. Looking at People
(Professor Alex Pentland)
This large, multiyear research project
called "Looking at People" is composed
of several different subprojects,
including real-time tracking of people's
body positions as they point and move
about in the work environment, gesture
and expression recognition, and
continued development of our real-time
face recognition system. Currently
there are two "test bed" applications
of this technology: a real-time
virtual reality system called ALIVE
(with Professor Pattie Maes) and a
"smart" teleconferencing system.
43. Model-Based Image Coding
(Professor Alex Pentland)
This research project is developing
generic, physically based models that
allow ultra-low bandwidth image
compression. Using such models we can
concisely describe an object's
appearance, and predict how its
appearance will change as the object
and camera move. Using these
techniques we have been able to
achieve high-quality still-image
compression with 50:1 to 100:1
compression ratios, and high-quality
video compression at only 8
kilobits/second.
'44. Video Databases: Indexing by
Content'
(Professor Alex Pentland)
One of the most significant problems
with multimedia technology is that you
can't find what you want. This is
because, unlike text-only systems, you
can't ask a computer about the
contents of images or video. For
instance, you can't ask the computer
to "find another video clip like this
one, but shot from another angle," or
"find a video clip of me on the
beach." We are working to solve these
problems by making computers able to
"see" the contents of images and
video.
45. Image Query by Texture Content
(Professor Rosalind W. Picard)
People can quickly scan a lot of
pictures and identify a particular
pattern in a still image or video
sequence. Machines currently cannot.
We are studying how humans recognize
visual patterns, and we are building
computer models to mimic this
behavior. Particular attention is
given to how humans classify patterns
and interpret directionality,
contrast, periodicity, randomness,
translation, rotation, perspective,
and scale.
46. Nonlinear Space-Time Texture
Models
(Professor Rosalind W. Picard)
A bicyclist's pedaling may be
identified as a periodic texture in
time. Ravaging flames or turbulent
water can each be thought of as a
stochastic texture in space and time.
We are developing nonlinear models for
spatio-temporal patterns that don't
adhere to the "rigid body, affine
motion" assumption. Models currently
under exploration include physical
models of turbulence and biologically
motivated reaction-diffusion systems.
We have also been developing general
methods for nonlinear optimization;
these have many applications such as
recognition of nonlinear patterns.
47. Semantic Image Modeling
(Professor Rosalind W. Picard)
If I state "Atlanta is in Cincinnati"
today, it is unlikely you will think I
am coherent. If, however, we are
talking baseball, then the sentence is
very clear. The context makes the
interpretation not only easier, but
possible. Similarly, with pictures, if
you see blue at the top then it's
probably sky. The goal of this work is
to begin setting up two-way
interaction between available
contextual information and the models
used to represent visual information.
The ultimate goal is the one Shannon
missed - putting semantic meaning into
"information" theory.
48. Computers and Telephony
(Christopher M. Schmandt)
Computer workstations can provide a
much needed user interface to advanced
telephony functions, provided a path
exists between the workstation and
switch. Controlling call set-up from a
user's workstation allows a greater
degree of personalization and dynamic
call handling, both for outgoing and
incoming calls. This project is being
implemented in the ISDN environment of
MIT's campus telephone network, using
Phoneserver, a computer network
interface to Basic Rate ISDN
switching.
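The personalization this workstation-to-switch path enables can be sketched as a small rule engine deciding what to do with an incoming call before it rings. The rule format and actions below are invented for illustration and are not Phoneserver's actual interface:

```python
# Hypothetical per-user call-handling rules: each rule may constrain
# the caller and the hour of day, and the first match wins.

def route_call(caller, hour, rules, default="ring"):
    """Return the action for an incoming call (first matching rule)."""
    for rule in rules:
        if "caller" in rule and rule["caller"] != caller:
            continue
        if "hours" in rule and hour not in rule["hours"]:
            continue
        return rule["action"]
    return default

rules = [
    {"caller": "boss", "action": "ring"},                # always ring
    {"hours": range(0, 8), "action": "voicemail"},       # overnight
    {"hours": range(12, 13), "action": "forward:lab"},   # lunch hour
]
```

With rules like these, `route_call("boss", 3, rules)` rings through while an unknown caller at 3 a.m. goes to voicemail, the kind of dynamic handling the project describes.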
49. Desktop Audio
(Christopher M. Schmandt)
This project explores software
architectures and user interfaces to
voice as a computer data type as well
as a command channel. Its goal is to
make speech ubiquitous across a range
of applications, for instance, editing
a telephone message to include it as
annotation in a text document. Related
issues include object-oriented
manipulation of multiple media
"selection" (or "clipboard") data
between processes.
50. Voice Interfaces to Hand-Held
Computers
(Christopher M. Schmandt)
This project is using a mock-up to
explore user interfaces and
applications of voice in a hand-held
computer. The target is a machine, the
size of a microcassette recorder,
which is simply a mobile extension of
a more powerful desktop computer.
Applications include note-taking,
outlining, and a memory assistant.
51. Voice Hypermedia
(Christopher M. Schmandt)
The project takes the traditional
"hypertext" approach to a voice-only
environment. Text is replaced by
recorded voice segments, and the user
interface consists of a speech
recognizer and speech synthesizer. A
related issue is automatic
segmentation of recorded speech
segments into semantically meaningful
chunks.
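One simple approach to the automatic segmentation mentioned above is to split a recording at sustained pauses. The sketch below operates on short-time energy values; the threshold and minimum pause length are illustrative assumptions, not the project's parameters:

```python
# Pause-based segmentation sketch: `energies` is a list of short-time
# energy values; runs of at least `min_pause` low-energy frames split
# the recording. Returns (start, end) index pairs, end exclusive.

def segment_by_pauses(energies, threshold=0.1, min_pause=3):
    segments, start, silence = [], None, 0
    for i, e in enumerate(energies):
        if e >= threshold:
            if start is None:
                start = i        # speech begins
            silence = 0
        elif start is not None:
            silence += 1
            if silence >= min_pause:
                # close the segment at the frame before the pause began
                segments.append((start, i - silence + 1))
                start, silence = None, 0
    if start is not None:
        segments.append((start, len(energies) - silence))
    return segments
```

Real semantically meaningful chunking needs more than energy (pauses do not always align with phrase boundaries), which is precisely why it is posed here as a research issue.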
52. Telephone-Based Voice Services
(Christopher M. Schmandt)
This project explores the utility of
voice in a range of applications
offering services to users of the
telephone network. Topics being
examined include voice mail, speech
synthesis of electronic mail, access
to calendars and rolodexes, and speech-
based user interface to call
processing features such as variable
call forwarding. Visual (on the
workstation) and speech (over the
telephone) based applications offer
differing views of the same underlying
databases in an office environment.
53. Synthetic Performers
(Professor Barry Vercoe)
We have shown that computers can
exhibit real-time musical behavior
similar to that of skilled human
performers. Our live violinist
accompanied by a computer-driven piano
has been widely viewed on public TV.
This research continues to explore the
music-cognitive issues that arise when
a computer is put in the position of
real-time, highly sensitive human
interaction.
54. Synthetic Listeners
(Professor Barry Vercoe)
This project is researching audio
signal separation, with a focus on
polyphonic pitch detection. We want to
understand how humans do multisource
audio separation with ease (the
"cocktail party conversation" trick),
and why machines cannot. We are
developing a representation of sound
using recent concepts of human
auditory encoding, so that machines
might perceive complex audio signals
the way humans do.
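As a point of reference for the polyphonic problem, the single-source case can be handled by autocorrelation. The sketch below is that far simpler baseline, not the project's auditory-encoding representation:

```python
import math

# Single-source pitch detection by autocorrelation: the lag at which
# a waveform best correlates with itself estimates its period.

def detect_period(signal, min_lag=2):
    n = len(signal)
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, n // 2):
        score = sum(signal[i] * signal[i + lag] for i in range(n - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

# A 100 Hz sine sampled at 1000 Hz has a period of 10 samples.
sig = [math.sin(2 * math.pi * 100 * t / 1000) for t in range(200)]
```

With two or more simultaneous sources the autocorrelation peaks interleave and this method breaks down, which is the gap the project's perceptual representation is meant to close.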
55. Synthetic Spaces
(Professor Barry Vercoe)
Research is being conducted on
electronic enhancement of a room's
natural ambience via an active
boundary system of microphones and
speakers. The technique utilizes a new
class of flat reverberators running on
a high-speed digital audio processor.
Our goal is to separate acoustics from
architecture within rooms and public
spaces.
56. Cognitive Audio Processing
(Professor Barry Vercoe)
This project is investigating how
humans perceive and quantify music and
audio information in cultural
contexts. This involves computer-
assisted understanding of source
identification, voice intonation,
rhythmic and tonal structure, and
emotional content, within both Western
and non-Western traditions.
57. Structured Audio Transmission
(Professor Barry Vercoe)
We are researching the flexible
encoding of speech, music, and
ambience (partially rendered),
suitable for rate-varying packet
transmission over a multiplexed
audio/video channel. We are also
studying receiver decoding, channel
assignment, and rendering according to
the level of local resources, with
receivers that are self-calibrating
and adaptive.
III. INFORMATION & ENTERTAINMENT
58. Salient Stills
(Walter Bender)
A Salient Still is a 1500-line, print-
quality photograph created from a
video sequence. It can carry both the
context and the detailed content of
the sequence. The data representation
consists of video pans, tilts, and
zooms warped into a continuous
space/time volume. A high-resolution,
panoramic still image is extracted
from this representation. This still
image has both the wide field of view
captured by the short focal-length
frames and the detail captured by the
long focal-length frames.
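The compositing step can be sketched in one dimension. The example below assumes the frame-to-frame offsets are already known (in the real system they are recovered from the pans, tilts, and zooms); each panorama position takes the median of every sample that lands on it, which suppresses transient foreground content:

```python
import statistics

# Toy 1-D panorama compositing: `frames` are lists of pixel values,
# `offsets` give each frame's known position in the panorama.

def composite(frames, offsets, width):
    columns = [[] for _ in range(width)]
    for frame, off in zip(frames, offsets):
        for x, v in enumerate(frame):
            if 0 <= off + x < width:
                columns[off + x].append(v)
    # Median over time at each position rejects transient outliers.
    return [statistics.median(c) if c else 0 for c in columns]
```

A transient value seen in only one frame is voted out by the median, while the panorama covers a wider field than any single frame, mirroring the still's combination of context and detail.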
59. Color Semantics
(Walter Bender)
We are exploring the role of color
alignment in the preservation of the
experience of color. Central to this
investigation is the formulation of
color alignment and its measurement.
Objective quantification of color
relatedness is desirable, since it
allows precise specification of color
in relation to its surrounding visual
context and state of visual
adaptation. A secondary theme of this
research is the role color alignment
plays in the generation of expressive
energy in color combinations.
Expressive load of color combinations
can be predicted, based on selection
of color alignments. We are applying
this work to the measure of degree-of-
alignment between window and
background in a workstation. This work
will provide guidelines for effective
selection of window, font, and
background colors for any given
application.
60. Knowing the Individual
(Walter Bender)
Just as a display should "know" the
data, it should also be cognizant of
the user. The more the system knows
about the user, the better able it
will be to make sense of the
ambiguities and inconsistencies
inherent in human communication. Our
work in user modeling involves the
full exploitation of the user's
computational environment, so that
information normally provided by the
computer (e.g., idle time, schedule
information, electronic mail
subscriptions) and other, more
esoteric information (e.g., physical
location tracking systems, eye-
tracking systems, speech manipulation,
electronic newspapers, model-building
cameras) can be integrated to
construct dynamic, individual user
models that change over time, both as
users change and as the system learns
more about them.
61. Interactive Computation of
Holographic Images
(Professor Stephen A. Benton)
The display of holographic 3-D images
requires many megabytes of data to be
recomputed every time the image is
changed. These calculations simulate
the propagation and interference of
light beams, but numerical shortcuts
and other new techniques have reduced
computation times by more than a
factor of twenty, to well under one
second,
allowing truly interactive
manipulation and exploration of
complex 3-D image data.
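The underlying computation being accelerated is interference of wavefronts. The sketch below evaluates the fringe pattern from a single object point against a plane reference wave along a 1-D hologram line; the wavelength and geometry are arbitrary example values, and a real display must evaluate millions of such samples per update, which is why the shortcuts described above matter:

```python
import cmath
import math

# Interference of a unit plane reference wave with the spherical
# wave from one object point, sampled along a 1-D hologram line.
# `point` is (x, z) in meters; wavelength defaults to HeNe red.

def fringe_intensity(xs, point=(0.0, 0.1), wavelength=633e-9):
    k = 2 * math.pi / wavelength
    px, pz = point
    out = []
    for x in xs:
        r = math.hypot(x - px, pz)         # distance to object point
        obj = cmath.exp(1j * k * r) / r    # spherical wavefront
        ref = 1.0                          # unit plane reference wave
        out.append(abs(obj + ref) ** 2)    # recorded intensity
    return out

xs = [i * 1e-5 for i in range(50)]
I = fringe_intensity(xs)
```

Summing such contributions over every object point, for every sample, is the full computation; the project's numerical shortcuts attack exactly this cost.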
62. Scaled-Up Holographic Video
(Professor Stephen A. Benton)
The world's first electronic
holographic video display has
established the principles of
information reduction and image
scanning, but scaling up to practical
display sizes has posed significant
electronic and electro-optical
challenges. The parallelization of the
computation, storage, and display has
been shown feasible for 3" x 5"
images, laying the groundwork for
further scale-ups of image size.
63. Holographic Laser Printer
(Professor Stephen A. Benton and
Michael A. Klug)
Full-color, wide-angle, and large-size
computer-generated hard-copy holograms
still take considerable time to
create. A "holographic laser printer"
allows simpler hard-copy holograms to
be generated in minutes instead of
hours, automatically and without wet
processing. Research topics include
recording materials and processing,
optical design, image processing and
LCD display, and optical techniques
for image noise reduction.
64. Immersive Projected-Image
Holographic Displays
(Professor Stephen A. Benton)
The creation of meter-sized
holographic 3-D images can be achieved
with large-area holograms, or via the
projection of images from smaller
holograms into wraparound optical
systems. Here we explore the
distortions and properties of deeply
concave mirrors used as projection
elements.
65. Medical Image Holography
(Professor Stephen A. Benton)
MRI and CAT-scan cameras gather three-
dimensional data, but holography
offers the only way of examining those
images in fully three-dimensional
form. This project explores new image-
processing, editing, and rendering
tools that are needed to make these
complex 3-D images quickly and
accurately interpretable by
physicians.
66. Edge-Lit Holograms
(Professor Stephen A. Benton)
Conventional holograms require
illuminators to be mounted on walls or
ceilings near the hologram; edge-lit
holograms are a new type of white-
light hologram that allows the light
source to be included within the mount
itself, assuring a compact and
carefully aligned illumination. This
project explores the fundamental
diffraction and imaging properties of
these holograms with a view toward
making their images deeper, brighter,
and clearer.
67. Open Architecture Television
(Professor V. Michael Bove)
Open Architecture Television explores
the encoding of digital video in such
a way that the parameters of
production (resolution, frame rate)
may be decoupled from those of the
display, supporting a broad variety of
production and display systems and
permitting easy international
interchange as well as interworking
between television and computer
equipment. We have successfully
demonstrated this idea using
spatiotemporal subband coding, and
also have developed frame-rate
decoupling methods appropriate for
motion-compensated coders such as
MPEG.
68. Cheops: Data-Flow Television
Receiver
(Professor V. Michael Bove)
The Cheops Imaging System is a
compact, modular platform for
acquisition, real-time processing, and
display of digital video sequences and
model-based representations of moving
scenes. It is intended as both a
laboratory tool and a prototype
hardware and software architecture for
future programmable video decoders.
Rather than using a large number of
general-purpose processors and
dividing up image processing tasks
spatially, Cheops abstracts out a set
of basic, computationally intensive
stream operations that may be
performed in parallel and embodies
them in specialized hardware. Eight
systems have been built and are in use
at the Media Lab and at various
sponsor sites.
69. Motion Modeling for Video Coding
(Professor V. Michael Bove)
Most digital video-coding methods use
a very simple approximation to scene
motion that breaks up images into
arrays of square tiles and assigns a
two-dimensional motion vector to each.
We are developing video-coding methods
that segment scenes into coherently
moving regions and compute more
accurate motions for the regions. The
result should be a more compact
representation, better scene
understanding, and the ability to
compute images for arbitrary instants
in time (in connection with Open
Architecture Television research).
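The conventional baseline the project improves upon, block matching, can be sketched directly: for one square tile of the current frame, exhaustively search a window in the previous frame for the best-matching position under a sum-of-absolute-differences cost. The tiny integer "frames" and parameters here are illustrative:

```python
# Exhaustive block-matching motion estimation for one tile.
# `prev` and `cur` are 2-D lists; (bx, by) is the tile's top-left
# corner in `cur`, `bs` the block size, `search` the window radius.

def best_motion_vector(prev, cur, bx, by, bs=2, search=2):
    def sad(dx, dy):
        return sum(abs(cur[by + j][bx + i] - prev[by + dy + j][bx + dx + i])
                   for j in range(bs) for i in range(bs))
    best, best_cost = (0, 0), sad(0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            if (0 <= bx + dx and bx + dx + bs <= len(prev[0])
                    and 0 <= by + dy and by + dy + bs <= len(prev)):
                c = sad(dx, dy)
                if c < best_cost:
                    best, best_cost = (dx, dy), c
    return best
```

Because every tile gets an independent 2-D vector regardless of scene content, object boundaries cut across tiles; segmenting scenes into coherently moving regions, as this project does, removes that limitation.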
70. Production, Distribution, and
Viewing of Structured Video Narratives
(Professor V. Michael Bove and
Professor Glorianna Davenport)
Research in video coding at the Media
Lab increasingly emphasizes structure
as a means of leveraging both
compression and story. Image
understanding, machine vision, and a
priori knowledge are used to produce
video representations in terms of
component parts (actors, backgrounds,
moving objects) and to produce content
annotations for story construction.
This form of coding has implications
for production, postproduction,
distribution, and viewing. The goal of
this project is to script, produce,
and work with a story represented as a
structured video database in order to
examine diverse issues including
script annotation and storyboarding,
camera design, production techniques,
data formatting, and viewing
paradigms.
71. Multimedia Testbed
(Professor Muriel R. Cooper, Ronald
MacNeil, and David Small)
The Meta-Media project integrates a
rich set of graphic tools and editors
with searching, browsing, linking,
scripting, and visualization
capabilities to allow research into
the new design issues emerging from
real-time, multilayered information in
an electronic communication
environment. The planning of
structured and unstructured
informational multimedia pathways
presents graphical design complexity
and challenge for both the designer
and the user of multimedia
information. Traditional media
designers from the print, audiovisual,
and animation worlds provide important
insights into guiding viewers'
perceptual responses to information.
Work that bridges the gap between the
hands-on world of designers and the
more abstract symbolic world of
programming explores spatial,
temporal, and relational rules and
methods which rank information for the
viewer, influence emotional responses,
and often embody hidden aesthetics.
Automatic layout and design
intelligence will be required to
filter data for users in every field.
Work is done in a sophisticated
hardware and software environment
which includes our own window manager.
72. Computationally Expressive Tools
(Professor Muriel R. Cooper, Ronald
MacNeil, and David Small)
We are developing a repertoire of
graphics that will allow computational
assistance in the expression of
dynamic and interactive design. In an
electronic information environment we
need new graphical principles, tools,
and editors which are suitable to the
integrated, interactive, dynamic, and
intelligent formation and presentation
of information. This graphical set
must be integrated with real-time
design-assistance systems in order to
cope with the magnitude of visual
complexity resulting from multiple
streams and forms of data that deluge
the user.
*Computational Graphics: Animation is
currently produced either by labor-
intensive cel animation, based on
expressive individual creativity, or
by traditional computer graphics
animation based on modeling of
physical behavior. While work in the
direction of coupling knowledge-based
animation is very young, we are
exploring ways of modeling and
animating data information as a set of
interactive tools - data as graphics/
behavior of information.
*Data-Driven Graphics: Data
visualization is the symbolic
counterpart of scientific
visualization in which we will build
transpositional models that will allow
various forms of on-the-fly
abstractions from real-time data
domains such as maps, weather, and
actuarial information.
*Behavioral Graphics: Information that
responds dynamically and interactively
to change based on physical models
drawn from work in scientific
visualization holds great promise. Our
work in responsive substrata that
allow the user to model paper fibers,
pigment, diffusion, and gravity will
be extended into informational models
that, for example, would graphically
indicate age or accuracy of data.
Further modeling of mark-making tools
and force feedback is planned.
*Animation: A cel-based animation
system with many unique capabilities
is the foundation for further
animation research. The integration of
hand-drawn animation with 3-D modeling
continues to be a research subject.
Work in moving back and forth from 2-D
to 3-D continues, as do investigations
into simple forms of automation.
*Sound-Graphics: This project explores
some of the unique and overlapping
characteristics of image and sound. In
Tone of Voice Typography, the color,
size, translucency, style, and even
meaning of a word may be driven by the
pitch of a sound over time. Recent
work includes sound at the interface,
sound/graphic objects, spatial sound,
and compositional and analytical tools.
*Adaptive Typography and Graphics: This
project is developing ways of filtering
typography and graphics on the fly for
greater legibility and maintaining the
perception of consistent color in an
unpredictable, changing environment.
These principles are being incorporated
with dynamics and intelligence, and
extended to include more complex
graphics.
*Topographical Typography: The goal of
this project is to develop dynamic
maps, typography, and graphics which
have knowledge of each other, and to
develop intelligent tools that allow
the effective design of graphical
behavior in relation to real-time
dynamic data.
*Visual Complexity and Selective
Filtering: Using Gaussian filters and
pyramid coding, translucency, blur,
and multiple layers of Landsat and
weather data, we are able to
selectively address aspects of complex
information in real time for task-
based information. Future work
includes making object-based elements,
local changes, zooming, two and one-
half and three-dimensional views, and
transitional changes.
*Configurable Interface Design: Ways
of interacting with these systems
graphically require new paradigms
beyond the desktop and window
metaphors. Integration of expressive
tools and graphical intelligence in a
multimedia environment will enhance
current work in graphical interfaces
that can adapt to task specifics and
personal preferences.
*Browsing and Navigation: Traversing
and navigating complex information
effectively requires new graphical
models which allow the user to
maintain context while exploring
multiple levels of information
simultaneously. The infinite zoom will
allow us to do nodal zooming while
maintaining graphical context in very
large informational databases.
73. Large-Scale, High-Resolution
Display Prototypes
(Professor Muriel R. Cooper, Ronald
MacNeil, and David Small)
Our prototype of a 2,000 by 6,000 line
display provides us with a testbed for
investigating the integration of
graphical presentation and
intelligence in interactive and
dynamic form. Integration of many of
our multimedia capabilities is
underway. The prototype is connected
to the Connection Machine and will
soon be connected to a fiber-optic
cable which will allow us to explore
collaborative and remote communication
and the implications of space on
information creation and management. A
prototype for an 8 x 10 flat panel
display is planned.
74. Input/Output Considerations
(Professor Muriel R. Cooper, Ronald
MacNeil, and David Small)
Hardcopy output will continue to play
a major role in the information
medium, and we will need intelligent
layout systems to transcode work areas
and sessions into appropriate layout
on paper. Work has just begun on this
aspect of the research.
75. Advanced Interactive Mapping
Displays
(Professor Muriel R. Cooper, Dr.
Richard A. Bolt, and Ronald MacNeil)
This topic represents a three-year
project involving:
*Development of graphically intelligent
tools and principles to support the
interactive creation of symbolic
information landscapes.
*Integration of such landscapes with
pictorially convincing virtual
environments.
*Enabling of multimodal natural
language communication with the
virtual environment display and its
contents via combinations of speech,
manual gesture, and gaze.
These three streams of investigation
are to converge in year three of the
overall project in the context of an
ultra-high-definition, seamlessly
tiled wall-sized display (DataWall).
76. Experiments in Elastic Media
(Professor Glorianna Davenport)
We define "Elastic Media" to be a user-
directed form of media storytelling in
which the computer mediates between
the user and chunks of content.
Content prototypes are developed to
demonstrate relationships between
content, form, modes of interaction,
and computational substructures.
Issues include research and production
of content segments and meaningful
machine-based orchestration of these
segments, based on user input. Current
projects include:
*Elastic Boston 2: A new content-based
project which focuses on the
intersection of a documentary style
guide to an urban venue and a shared
communication network. The project
will focus on downtown Boston, an area
from the Causeway to South Station,
including Faneuil Hall and the North
End. The system will offer
personalized local news, in-depth
reporting, community portraits, and
advertising. The application will
invite community members to share
localized exchanges concerning their
impressions, memories, and
activities.
*Movie-Maze: A virtual world has been
created for browsing movie trailers.
The world can be thought of as a 3-D
graphical mud in which users can
communicate with each other while they
are exploring the world and the movies
it contains.
*New Orleans Interactive (HyperCard
implementation): This project
explores structural issues related to
the design of complex documentary
narratives for education.
*Video Postcards: These are a semi-
structured form of personal
communication. Electronic postcard
formats should support inclusion of
low bandwidth movies suitable for tele-
network transmission.
*Wheel of Life: This project
represents multimedia which has
escaped the bounds of the box; this
project raises interesting issues
about interactive spaces and
collaborative discovery. This research
is particularly relevant for museum
exhibit design, theme parks, and
electronic performance spaces.
77. Video Editing: Computational
Partnerships
(Professor Glorianna Davenport)
Movie editing is extremely time-
consuming, so time-consuming, in fact,
that few home movies are ever edited.
The connection between video as
information and video as story will
become increasingly critical as
digital transmission of video from
remote visual databases becomes
viable. The goal of this work is to
integrate the moviemaker's knowledge
of content and craft into software in
order to model more robust human-
machine partnerships for video
storytelling. Systems include logging,
sequencing, and editing modules.
*Stratification: This research in
video description incorporates our
understanding of how the camera
mediates the environment while
recording content. The logging
environment is stream-based. The
browsing interface emphasizes
scalability of description hierarchy
and a graphical continuum. Both the
annotation and sequencing tools are
linked to Framer to allow maximum
interchange between machine-based
annotation algorithms, human
annotation, and storytelling
structures. The interface will be
expanded to include the creation and
use of low-level and high-level
relationships found within the
content.
*Log Boy and Filter Girl: This work
focuses on programmatic storytelling.
The system encourages the filmmaker to
think about the multiple playouts of a
story during script development. After
describing the story purpose and
defining the character set, the
filmmaker defines axes of interaction
which will be allowed in the story
playout. These axes serve as an
organizing metaphor for script
expansion. The logging is defined as a
function of filtering and vice versa.
The logging process is dynamic,
graphical, and attribute-oriented. A
series of predefined filters can be
expanded by the user, based on
particular needs. Filters reflect axes
of interaction. Several filters are
generally cascaded to offer maximum
flexibility in shot selection.
*Video Streamer and Collage: This is a
two-and-one-half-dimensional paradigm
for browsing which includes
object-like graphical selection of
video from any source. The interface
focuses on multiple views of the video
and audio stream, including the edge
of the frame, the frame in relation to
other frames, and audio associated
with a given frame. The stream is
parsed algorithmically for shot
boundaries. Movie clips can be
selected and manipulated in a collage
space.
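One common heuristic for the algorithmic shot-boundary parsing mentioned above is to flag a cut wherever the intensity histogram changes sharply between consecutive frames. The sketch below works on flat lists of pixel values; the bin count and threshold are illustrative assumptions:

```python
# Shot-boundary detection by histogram differencing. Each frame is a
# flat list of grayscale pixel values in 0..255.

def histogram(frame, bins=4, maxval=256):
    h = [0] * bins
    for v in frame:
        h[v * bins // maxval] += 1
    return h

def shot_boundaries(frames, threshold=0.5):
    """Return indices of frames that start a new shot."""
    cuts = []
    for i in range(1, len(frames)):
        a, b = histogram(frames[i - 1]), histogram(frames[i])
        # L1 histogram distance, normalized by frame size.
        diff = sum(abs(x - y) for x, y in zip(a, b)) / len(frames[i])
        if diff > threshold:
            cuts.append(i)
    return cuts
```

Histogram differencing is robust to motion within a shot (the histogram barely changes) but responds strongly to the wholesale content change at a cut.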
78. Stories with a Sense of Themselves
(Professor Glorianna Davenport)
Current research into multithreaded
stories and storytelling tools begs
the issue of an author with a deep
sense of commitment to the story being
told. This project seeks to
explore the relationship between
personalization of the story for the
viewer and tools which specify the
author's concepts and constructs.
*Digitally Orchestrated Micromovies:
For many applications, story filters
designed by the author will allow a
viewer to drive through a database of
micromovies. The filters can include
simple content relationships and
stylistic features. The method is
illustrated with several prototype
movies, including "Endless
Conversation," "Dial a News Summary,"
and "This Ad is for YOU."
*Multithreaded Narratives: This
project is a theoretical and practical
exploration into narrative structures.
*Semantic News Network: This project
is a look at how information services
might be structured to accommodate
thoughtful interactions.
79. Directing Digital Video: New
Tools
(Professor Glorianna Davenport)
As digital video comes into its own,
directors will need new tools to
preview and construct story elements
for multithreaded, interactive
scenarios.
*The Director's Eyeglass: This project
is a portable prototype which will
allow a director to preview digital
effects in the field.
*Coding Camera Motion and Field of
View: This project looks at the
mechanics for recording and using
information about the camera view to
link content segments.
*The Journalist's Conceptual Notepad:
This project looks at how the
journalist can create a rich, machine-
readable conceptual framework during
development of story concepts. The
project will encourage the
preservation of the journalist's
framework during the reconstruction of
a story by personalizing agents.
80. Storyteller Systems
(Professor Glorianna Davenport and
Professor Kenneth Haase)
Storyteller systems are sophisticated
programs with deep and detailed
knowledge of some particular domain or
domains and access to "media
resources" - recorded video, sound,
and text - regarding the domain. By
combining these resources with
synthesized graphical and textual
representations, a storyteller system
produces a story customized to what it
knows - and what it learns - of a
listener's background, preferences,
and interests. These stories emerge
dynamically as the system interacts
with the user; questions and
criticisms yield wholly new sequences
of video, sound, and explanation in
reply. Such systems transform the
character of publication: rather than
producing epistles, one produces
emissaries.
81. Production, Distribution, and
Viewing of Structured Video Narratives
(Professor Glorianna Davenport and
Professor V. Michael Bove)
Research in video coding at the Media
Lab increasingly emphasizes structure
as a means of leveraging both
compression and story. Image
understanding, machine vision, and a
priori knowledge are used to produce
video representations in terms of
component parts (actors, backgrounds,
moving objects) and to produce content
annotations for story construction.
This form of coding has implications
for production, postproduction,
distribution, and viewing. The goal of
this project is to script, produce,
and work with a story represented as a
structured video database in order to
examine diverse issues including
script annotation and storyboarding,
camera design, production techniques,
data formatting, and viewing
paradigms.
82. Real-Time Modeling
(Professor Neil Gershenfeld)
As routinely accessible computers
begin to approach gigaflop speeds and
as data networks approach
gigabit/second bandwidths, it becomes
possible to interact in real time with
meaningful numerical models. We are
exploring this promise in the context
of musical instruments, both because
of its significance for their
evolution and because they provide an
extremely demanding environment that
requires the integration of multiple
degrees of freedom of real-time I/O
with state-of-the-art computational
processing. We will be doing
experiments to characterize the
physics of successful traditional
instrument designs, using these
experiments to guide the creation of
numerical representations (based on
both first-principles physical models
and on nonlinear time-series
analysis), and developing new
approaches to interface a player to
these models. The initial goal is to
capture the instrument's performance
from the perspective of the player
(i.e., pass a musical Turing test),
and the longer term goal is to move
beyond these traditional designs while
still maintaining their mature
richness and subtlety. It is
anticipated that the tools that are
developed for this will be applicable
to more general human-machine
interaction problems.
83. Interface Sensors and Transducers
(Professor Neil Gershenfeld)
Technological interfaces must sense
user activity on a wide range of
length scales, ranging from less than
a millimeter (stylus input), through
centimeters (gesture sensing) and
meters (local tracking), to kilometers
(navigation). Increasingly, these
measurements must be done in three
dimensions, must produce images as
well as measurements, and must
maintain the required spatial and
temporal resolution without
significantly encumbering the user.
Force must often be measured along
with position, and it may be desirable
to generate output force (tactile
feedback). Unfortunately, the poor
state of the available sensing and
transduction technology for these
problems has been a significant
constraint on the development of many
new applications. We are using a range
of experimental techniques to develop
the instrumentation for the
environment around information
processing systems. This includes
designing and applying new materials,
the use of lensless imaging, and the
active remote interrogation of passive
sensors.
84. Information, Computation, and
Physics
(Professor Neil Gershenfeld)
Information, as logical content,
necessarily has a physical reality.
Although these two levels of
description are usually entirely
distinct (the designer of a
conventional memory circuit does not
need to know what messages it will
store), there are exciting
possibilities and increasingly serious
constraints associated with their
interface in devices that store,
transmit, and manipulate information.
We are exploring this area in both
directions: using physical insights
to help solve engineering problems
(such as the use of dissipative
dynamic systems to satisfy
communication channel constraints) and
using engineering insights to help
understand physical systems (such as
applying ideas from information theory
to help understand complex physical
systems). A central theme is the
relationship between logical and
physical entropy; here we are studying
the use of active devices to bypass
conventional thermodynamic limits in
logic.
85. Incremental Coding
(Andrew B. Lippman)
High-quality compression is inherently
asymmetric - robust source processing
directly yields more efficient image
representations. Expecting that the
original material may be available
only once, this research is directed
at creating a compressed, intermediate
format that can be translated into
consumer distribution formats for any
rate from 1.5 megabits/second to
studio quality, using hardware no more
complex than a home decoder. A
corollary is real-time encoding that
is later asymmetrically processed (in
the background) to reduce the
immediately available digital
workprint to a distribution format.
86. Movies via Modems
(Andrew B. Lippman)
Ultralow bandwidth coding divides a
scene into background and dynamic
elements (objects) that can composite
any individual frame. An example is
telephonic movies where a library of
essential scene elements is
distributed in advance, but the cues
needed to assemble them into a movie
are sent at viewing time over normal
telephone lines. Alternatively, one
could store more than one episode of a
series on a single compact disc. This
"book of the month" movie system
allows periodic distribution of the
core parts of many movies on one
compact disc (or by downloading),
combined with real-time telephone
delivery of assembly rules.
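The library-plus-cues idea can be sketched in miniature: scene elements (here, 1-D sprites over a background) are stored locally in advance, and each frame is assembled from a short cue listing element IDs and positions. Only the cues would need to travel over the phone line. The names and structures below are invented for illustration:

```python
# Assemble one frame from pre-distributed elements plus a cue list.
# `library` maps element IDs to sprites (lists of pixel values);
# `cues` is a list of (element_id, position) pairs for this frame.

def assemble_frame(library, background, cues):
    frame = list(background)             # start from the stored background
    for element_id, pos in cues:
        sprite = library[element_id]
        for i, v in enumerate(sprite):
            if 0 <= pos + i < len(frame):
                frame[pos + i] = v       # composite the element in place
    return frame
```

The bandwidth asymmetry is the point: the library is large but sent once (on disc or by download), while the per-frame cues are a few bytes each, small enough for real-time telephone delivery.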
87. Objective Coding
(Andrew B. Lippman)
Objective Coding generalizes early
work on Scene Widening (1992) to
analyze a picture sequence into
components separated by their
activity. The goal is similar to book-
of-the-month movies, but the
concentration is on scene analysis.
Objective coding uses panoramic
storage and compositing to construct
each frame of a sequence by warping
and shifting elements stored in
memory. In the current work, the basic
architectural elements of an MPEG
decoder are reconfigured so that its
internal memory contains enlarged
background and foreground objects
instead of adjacent frames.
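A toy version of the panoramic-memory
idea: the decoder holds one enlarged
background, and each frame is produced
by shifting a view window across it and
overlaying a foreground object, rather
than by decoding adjacent frames. The
sizes and names here are invented for
illustration.

```python
import numpy as np

# Enlarged background held in decoder memory (320 px wide),
# plus one foreground object.
panorama = np.tile(np.arange(320, dtype=np.uint8), (120, 1))
sprite = np.full((30, 30), 255, dtype=np.uint8)

def render(pan_x, sprite_rc):
    """Crop a 160-pixel-wide view at horizontal offset pan_x,
    then paste the foreground sprite at (row, col) sprite_rc."""
    view = panorama[:, pan_x:pan_x + 160].copy()
    r, c = sprite_rc
    view[r:r + 30, c:c + 30] = sprite
    return view

# Simulating a camera pan: only the offset and sprite position
# change per frame; the panorama is decoded once.
frames = [render(pan_x, (45, 65)) for pan_x in range(0, 40, 10)]
```

Per-frame state reduces to a shift and
an object placement, which is the sense
in which the scene is "analyzed into
components separated by their
activity."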
88. Dimensionalization
(Andrew Lippman and Henry Holtzman)
Images from multiple still and cinema
cameras aimed at the same event are
merged into a four-dimensional (x, y,
z, t) visual database of the scene to
allow multiple perspectives,
relighting, and new picture content.
Ultimately, this approach might allow
the viewer to roam through the set,
taking the position of the camera
operators or anyplace in between.
Initial work addressed static scene
elements ("Lucy's Kitchen"); current
work extends this to include moving
elements, live actors, and the mixture
of still photographs with movie
footage.
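One way to picture the (x, y, z, t)
visual database: samples from several
cameras are binned into a sparse
space-time grid, and a query merges
whatever samples fall in the requested
cell. The grid resolution and the
class API below are assumptions made
for illustration only.

```python
from collections import defaultdict

class SceneDB:
    """Sparse space-time grid of scene samples from many cameras."""

    def __init__(self, cell=1.0):
        self.cell = cell
        self.bins = defaultdict(list)   # (ix, iy, iz, it) -> [values]

    def _key(self, x, y, z, t):
        return tuple(int(v // self.cell) for v in (x, y, z, t))

    def add(self, x, y, z, t, value):
        """Record one sample (e.g. a brightness) seen by some camera."""
        self.bins[self._key(x, y, z, t)].append(value)

    def query(self, x, y, z, t):
        """Merge all cameras' samples for this space-time cell."""
        vals = self.bins.get(self._key(x, y, z, t))
        return sum(vals) / len(vals) if vals else None

db = SceneDB()
db.add(1.2, 0.3, 4.0, 0.0, 100)        # camera A
db.add(1.4, 0.3, 4.1, 0.0, 120)        # camera B, same cell
merged = db.query(1.3, 0.3, 4.0, 0.0)  # both cameras' views merged
```

Rendering a novel viewpoint then
amounts to querying such a database
along new rays, rather than replaying
any one camera's footage.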
89. Casual Collaboration
(Andrew Lippman and Judith Donath)
Video images are used to create a
visual and interactive representation
of an on-line collaborative community.
A database format is being developed
that permits the modification and
reuse of the basic images to represent
changing events in the visualized
community.
The research investigates perceptual
issues in synthesizing a coherent
scene from disparate parts, social
issues in the visual depiction of a
community, and technical issues in the
integration of live and processed
video.
90. Structure out of Sound
(Andrew Lippman, Professor Marvin
Minsky, and Michael Hawley)
In an information-rich environment
where data, images, and sound are
readily accessible and digitally
communicated, content-based search
becomes a necessity.
Structure out of Sound is the first
attempt at a unified analysis tool for
speech, music, and sound effects.
Movies are analyzed into sonic
primitives that allow one to divide a
movie into dialogue and action or to
identify the presence of a single
actor. The initial work, a doctoral
thesis, lays the groundwork for the
later addition of visual browsing and
correlating elements.
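A crude stand-in for one piece of such
analysis: label windows of an audio
track as low- or high-energy, a first
step toward separating quiet dialogue
from loud action. The thesis work is
far richer than this; the sketch only
illustrates deriving per-window sonic
primitives from a track.

```python
import numpy as np

def label_windows(signal, win=1024):
    """Return one 'quiet'/'loud' label per window, splitting at
    the median short-time RMS energy of the track."""
    n = len(signal) // win
    rms = np.sqrt(np.mean(
        signal[:n * win].reshape(n, win) ** 2, axis=1))
    threshold = np.median(rms)
    return ["loud" if e > threshold else "quiet" for e in rms]

# Synthetic track: soft noise followed by a loud burst.
rng = np.random.default_rng(0)
track = np.concatenate([0.05 * rng.standard_normal(4096),
                        0.8 * rng.standard_normal(4096)])
labels = label_windows(track)
```

Runs of "quiet" and "loud" labels give
a rough dialogue/action segmentation
that higher-level analysis (speech,
music, effects) can then refine.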
91. Hyperinstruments
(Professor Tod Machover)
The Hyperinstruments project attempts
to define and produce what we consider
models for the musical instruments of
the future. These
prototypes combine new definitions of
musical virtuosity with intelligent
machine understanding and music
structure generation. Efforts
continued during the past year to turn
our "HyperLISP" environment into a
general research tool, one which is
currently employed by various
researchers at the Media Lab and at
various other centers and
institutions. Work on the automated
music generation and analysis system
Cypher was completed, and is the
subject of a book to be published soon
by the MIT Press. Various music
cognition studies into phenomena such
as beat and phrase tracking have
yielded intelligent algorithms which
are being incorporated into our
systems. Research is continuing on
turning physical gesture (notably a
conductor's left-hand articulations)
into real-time control signals, using
specially designed hand-tracking
technology. Special emphasis has been
placed on the physical and sonic
detection of existing acoustic musical
instruments, most notably stringed
instruments, including joint-angle
movement sensing, finger-position
sensing, bow-position sensing, and
special digital signal processing
techniques for pitch, timbre, and
phrase analysis, including some using
synchronized dot patterns. Several new
musical compositions, including one
for the cellist Yo-Yo Ma, have been
produced and performed using our
hyperinstrument techniques.